Modeling replication strategies in data grid systems with arbitrary clustered demands
Abstract
This paper considers the relationship between the request distribution and the replica distribution in a data grid when requests exhibit arbitrary clustered demands. We first give a formal model of replication strategies in a data grid system. Second, we investigate the optimal way to replicate data, with the objective of minimizing average access latency, when requests exhibit arbitrary clustered demands. We explain why, under the optimal strategy, replicas should be distributed uniformly within a sub grid when requests are uniformly distributed in that sub grid. We then investigate the relationship between different files in a sub grid. Furthermore, we analyze the case in which all sub grids are equal-sized and conclude that when requests are uniformly distributed across the system, replicas should be uniformly distributed across the system as well. Finally, we give an optimal strategy for the case in which sub grids are not equal-sized and different sub grids exhibit different request clustering patterns. Compared with some popular strategies, the optimal strategy requires less wide area network bandwidth and achieves lower average access latency. Simulation results validate the effectiveness of the optimal strategy.
Similar resources
A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment
Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is the data replication technique, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
E2DR: Energy Efficient Data Replication in Data Grid
Abstract— Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...
Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy
Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources, enabling users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...
Dynamic Replication based on Firefly Algorithm in Data Grid
In a data grid, reservation is an accepted way to provide scheduling and quality of service. Users need access to data stored across a geographical environment, which can be addressed by using replication, and an action taken to reach certainty. As a result, users are directed toward the nearest version to access information. The most important point is to know in which sites and distributed sy...
Publication date: 2008